Python
Unlike the other project, this was a brief introduction to data analysis. I learned it by working through a set of small tasks on several datasets and by learning how to ask the right questions. The tasks were the following:
1) Instruction ( For Data Cleaning ) - Remove the column that only contains missing values.
2) Question ( Based on Filtering + Value Counts ) - For Speeding , were Men or Women stopped more often ?
3) Question ( Groupby ) - Does gender affect who gets searched during a stop ?
4) Question ( mapping + data-type casting ) - What is the mean stop_duration ?
5) Question ( Groupby , Describe ) - Compare the age distributions for each violation.
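
A minimal sketch of how the five traffic-stop tasks above could be approached with pandas. The DataFrame is toy data, and every column name (`driver_gender`, `violation`, `stop_duration`, ...) as well as the minutes assigned to each duration bucket are assumptions, not values taken from the real dataset.

```python
import pandas as pd

# Toy stand-in for the traffic-stop data; names and values are assumptions.
stops = pd.DataFrame({
    "country_name": [None, None, None, None, None],
    "driver_gender": ["M", "F", "M", "M", "F"],
    "driver_age": [25, 31, 40, 22, 35],
    "violation": ["Speeding", "Speeding", "Speeding", "Equipment", "Equipment"],
    "search_conducted": [True, False, False, True, False],
    "stop_duration": ["0-15 Min", "16-30 Min", "0-15 Min", "30+ Min", "0-15 Min"],
})

# 1) Drop every column in which all values are missing.
stops = stops.dropna(axis="columns", how="all")

# 2) Filter to speeding stops, then count the stops per gender.
is_speeding = stops["violation"] == "Speeding"
speeding_by_gender = stops.loc[is_speeding, "driver_gender"].value_counts()

# 3) Search rate per gender (the mean of a boolean column is a proportion).
search_rate = stops.groupby("driver_gender")["search_conducted"].mean()

# 4) Map each duration bucket to a representative number of minutes
#    (this mapping is an assumption), cast to float, then average.
bucket_minutes = {"0-15 Min": 7.5, "16-30 Min": 24, "30+ Min": 45}
duration_minutes = stops["stop_duration"].map(bucket_minutes).astype(float)
mean_duration = duration_minutes.mean()

# 5) Summary statistics of driver age for each violation.
age_by_violation = stops.groupby("violation")["driver_age"].describe()

print(speeding_by_gender, search_rate, mean_duration, age_by_violation, sep="\n\n")
```

A gap in the search rate between genders only shows a difference in the data, not a cause; grouping by both `violation` and `driver_gender` would be the next question to ask.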
1) Convert the Datatype of 'Date' column to Date-Time format.
2) Add a new column 'year' to the dataframe, which contains the year only.
2.B) Add a new column 'month' as the 2nd column of the dataframe, which contains the month only.
3) Remove the columns 'year' and 'month' from the dataframe.
4) Show all the records where 'No. of Crimes' is 0. And, how many such records are there ?
5) What is the maximum & minimum 'average_price' per year in England ?
6) What is the Maximum & Minimum No. of Crimes recorded per area ?
7) Show the total count of records of each area, where average price is less than 100000.
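
The housing tasks above can be sketched roughly as follows. The data is a toy frame, and the column names (`date`, `area`, `average_price`, `no_of_crimes`) are assumptions about how the real file is laid out. The helper columns are dropped last, because the per-year question needs `year`.

```python
import pandas as pd

# Toy stand-in for the housing data; names and values are assumptions.
housing = pd.DataFrame({
    "date": ["1995-01-01", "1995-02-01", "1996-01-01", "1996-02-01"],
    "area": ["england", "camden", "england", "camden"],
    "average_price": [53000, 120000, 55000, 90000],
    "no_of_crimes": [0.0, 310.0, 0.0, 290.0],
})

# 1) Convert the date column from text to datetime.
housing["date"] = pd.to_datetime(housing["date"])

# 2) Append a 'year' column, and insert 'month' as the second column.
housing["year"] = housing["date"].dt.year
housing.insert(1, "month", housing["date"].dt.month)

# 4) Records with zero crimes, and how many there are.
zero_crimes = housing[housing["no_of_crimes"] == 0]
n_zero = len(zero_crimes)

# 5) Minimum and maximum average price per year, for England only.
england = housing[housing["area"] == "england"]
england_price = england.groupby("year")["average_price"].agg(["min", "max"])

# 6) Minimum and maximum number of crimes per area.
crimes_by_area = housing.groupby("area")["no_of_crimes"].agg(["min", "max"])

# 7) Number of records per area with an average price below 100000.
cheap_counts = housing.loc[housing["average_price"] < 100000, "area"].value_counts()

# 3) Remove the helper columns again once they are no longer needed.
housing = housing.drop(columns=["year", "month"])

print(n_zero, england_price, crimes_by_area, cheap_counts, sep="\n\n")
```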
1) Instruction ( For Data Cleaning ) - Find all Null Values in the dataset. If there is any null value in any column, then fill it with the mean of that column.
2) Question ( Based on Value Counts ) - What different types of Make are there in our dataset, and what is the count (occurrence) of each Make in the data ?
3) Instruction ( Filtering ) - Show all the records where Origin is Asia or Europe.
4) Instruction ( Removing unwanted records ) - Remove all the records (rows) where Weight is above 4000.
5) Instruction ( Applying function on a column ) - Increase all the values of 'MPG_City' column by 3.
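
A small sketch of the five car tasks above, assuming hypothetical column names (`Make`, `Origin`, `Weight`, `MPG_City`) and a toy frame in place of the real data. Only numeric columns are filled with their mean, since a mean is not defined for text columns.

```python
import pandas as pd

# Toy stand-in for the cars data; names and values are assumptions.
cars = pd.DataFrame({
    "Make": ["Acura", "Audi", "Audi", "Ford", "Toyota"],
    "Origin": ["Asia", "Europe", "Europe", "USA", "Asia"],
    "Weight": [3500.0, 3600.0, 4200.0, 4100.0, 2800.0],
    "MPG_City": [20.0, None, 18.0, 15.0, 30.0],
})

# 1) Count nulls per column, then fill numeric nulls with the column mean.
null_counts = cars.isnull().sum()
numeric_cols = cars.select_dtypes("number").columns
cars[numeric_cols] = cars[numeric_cols].fillna(cars[numeric_cols].mean())

# 2) Distinct makes and how often each one occurs.
make_counts = cars["Make"].value_counts()

# 3) Records whose origin is Asia or Europe.
asia_or_europe = cars[cars["Origin"].isin(["Asia", "Europe"])]

# 4) Remove rows with a weight above 4000 (keep everything up to 4000).
cars = cars[cars["Weight"] <= 4000].copy()

# 5) Increase every MPG_City value by 3.
cars["MPG_City"] = cars["MPG_City"] + 3

print(null_counts, make_counts, asia_or_europe, cars, sep="\n\n")
```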
1) Show the number of Confirmed, Deaths and Recovered cases in each Region.
2) Remove all the records where the Confirmed Cases is Less Than 10.
3) In which Region, maximum number of Confirmed cases were recorded ?
4) In which Region, minimum number of Deaths cases were recorded ?
5) How many Confirmed, Deaths & Recovered cases were reported from India till 29 April 2020 ?
6-A ) Sort the entire data by No. of Confirmed cases in ascending order.
6-B ) Sort the entire data by No. of Recovered cases in descending order.
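
One possible way to work through the case-count tasks above. The frame below is toy data, the column names (`Region`, `Confirmed`, `Deaths`, `Recovered`) are assumptions, and the sketch assumes the file is a single cumulative snapshot, so "till 29 April 2020" reduces to summing the rows for India.

```python
import pandas as pd

# Toy stand-in for the case data; names and values are assumptions.
covid = pd.DataFrame({
    "State": ["Delhi", "Kerala", "Hubei", "Lombardy", None],
    "Region": ["India", "India", "Mainland China", "Italy", "Iceland"],
    "Confirmed": [30, 20, 100, 80, 5],
    "Deaths": [2, 1, 10, 12, 0],
    "Recovered": [10, 15, 70, 30, 4],
})
cases = ["Confirmed", "Deaths", "Recovered"]

# 1) Totals of each case type per region.
by_region = covid.groupby("Region")[cases].sum()

# 3) Region with the most confirmed cases, 4) region with the fewest deaths.
most_confirmed = by_region["Confirmed"].idxmax()
fewest_deaths = by_region["Deaths"].idxmin()

# 5) Totals for India.
india_totals = by_region.loc["India"]

# 2) Remove records with fewer than 10 confirmed cases.
covid = covid[covid["Confirmed"] >= 10]

# 6-A) Ascending by confirmed cases, 6-B) descending by recovered cases.
by_confirmed = covid.sort_values("Confirmed")
by_recovered = covid.sort_values("Recovered", ascending=False)

print(by_region, most_confirmed, fewest_deaths, india_totals, sep="\n\n")
```

Whether the region totals are computed before or after removing the small records changes the answer to question 4, so the order of the steps is worth stating explicitly.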
1) How can we hide the index of the dataframe ?
2) How can we set a caption / heading on the dataframe ?
3) Show the records related with the districts - New Delhi , Lucknow , Jaipur.
4) Calculate state-wise :
A. Total number of population.
B. Total no. of the population with different religions.
5) How many Male Workers were there in Maharashtra state ?
6) How to set a column as index of the dataframe ?
7) A. Add a Suffix to the column names.
B. Add a Prefix to the column names.
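
A rough sketch of tasks 3 to 7 above on a toy frame; the column names (`District_name`, `State_name`, `Population`, ...) and the suffix and prefix strings are assumptions. Tasks 1 and 2 are display-only and go through the pandas Styler (`df.style.hide(axis="index")` in recent pandas versions and `df.style.set_caption("...")`), which needs the optional jinja2 package, so they are left out of the runnable part.

```python
import pandas as pd

# Toy stand-in for the census data; names and values are assumptions.
census = pd.DataFrame({
    "District_name": ["New Delhi", "Lucknow", "Jaipur", "Pune", "Mumbai"],
    "State_name": ["NCT OF DELHI", "UTTAR PRADESH", "RAJASTHAN",
                   "MAHARASHTRA", "MAHARASHTRA"],
    "Population": [100, 200, 150, 300, 400],
    "Hindus": [80, 150, 120, 240, 300],
    "Muslims": [15, 45, 25, 40, 80],
    "Male_Workers": [30, 60, 45, 90, 120],
})

# 3) Records for a fixed set of districts.
wanted = ["New Delhi", "Lucknow", "Jaipur"]
selected = census[census["District_name"].isin(wanted)]

# 4-A) Total population per state.
population_by_state = census.groupby("State_name")["Population"].sum()

# 4-B) Population per religion and state.
religion_by_state = census.groupby("State_name")[["Hindus", "Muslims"]].sum()

# 5) Male workers in Maharashtra.
in_maharashtra = census["State_name"] == "MAHARASHTRA"
maharashtra_male_workers = census.loc[in_maharashtra, "Male_Workers"].sum()

# 6) Use a column as the index.
indexed = census.set_index("District_name")

# 7) Add a suffix / prefix to every column name.
suffixed = census.add_suffix("_raw")
prefixed = census.add_prefix("census_")

print(selected, population_by_state, religion_by_state, sep="\n\n")
```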
Task. 1) Is there any Duplicate Record in this dataset ? If yes, then remove the duplicate records.
Task. 2) Is there any Null Value present in any column ? Show with Heat-map.
1) For 'House of Cards', what is the Show Id and Who is the Director of this show ?
2) In which year was the highest number of TV Shows & Movies released ? Show with Bar Graph.
3) How many Movies & TV Shows are in the dataset ? Show with Bar Graph.
4) Show all the Movies that were released in year 2000.
5) Show only the Titles of all TV Shows that were released in India only.
6) Show Top 10 Directors, who gave the highest number of TV Shows & Movies to Netflix ?
7) Show all the Records, where "Category is Movie and Type is Comedies" or "Country is United Kingdom".
8) In how many movies/shows was Tom Cruise cast ?
9) What are the different Ratings defined by Netflix ?
9.1) How many Movies got the 'TV-14' rating, in Canada ?
9.2) How many TV Shows got the 'R' rating, after year 2018 ?
10) What is the maximum duration of a Movie/Show on Netflix ?
11) Which individual country has the Highest No. of TV Shows ?
12) How can we sort the dataset by Year ?
13) Find all the instances where: Category is 'Movie' and Type is 'Dramas' or Category is 'TV Show' & Type is 'Kids' TV'.
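
A sketch of a few of the Netflix tasks above, using placeholder titles, directors and actors rather than real catalogue entries. The column names (`Category`, `Type`, `Release_Date`, ...) are assumptions, and the bar graphs and heat-map are left out so the example runs without a plotting library.

```python
import pandas as pd

# Toy stand-in for the Netflix data; all names and values are placeholders.
netflix = pd.DataFrame({
    "Show_Id": ["s1", "s2", "s3", "s4", "s4"],
    "Category": ["TV Show", "Movie", "Movie", "TV Show", "TV Show"],
    "Title": ["Show A", "Movie B", "Movie C", "Show D", "Show D"],
    "Director": ["Dir X", "Dir Y", "Dir X", None, None],
    "Cast": ["Actor P, Actor Q", "Actor Q", None, "Actor P", "Actor P"],
    "Country": ["India", "United Kingdom", "Canada", "India", "India"],
    "Release_Date": ["2019-01-01", "2000-03-05", "2000-07-09",
                     "2020-05-02", "2020-05-02"],
    "Rating": ["TV-MA", "R", "TV-14", "TV-14", "TV-14"],
    "Type": ["Kids' TV", "Comedies", "Dramas", "Dramas", "Dramas"],
})

# Task 1) Remove duplicate records. Task 2) Count nulls per column.
netflix = netflix.drop_duplicates()
null_counts = netflix.isnull().sum()

# 2) Year with the most releases (needs a parsed date and a year column).
netflix["Release_Date"] = pd.to_datetime(netflix["Release_Date"])
netflix["Year"] = netflix["Release_Date"].dt.year
busiest_year = netflix["Year"].value_counts().idxmax()

is_movie = netflix["Category"] == "Movie"
is_tv = netflix["Category"] == "TV Show"

# 4) Movies released in 2000. 5) Titles of TV shows released in India only.
movies_2000 = netflix[is_movie & (netflix["Year"] == 2000)]
india_tv_titles = netflix.loc[is_tv & (netflix["Country"] == "India"), "Title"]

# 6) Directors with the most entries (missing directors are ignored).
top_directors = netflix["Director"].value_counts().head(10)

# 7) (Movie and Comedies) or country is United Kingdom.
comedies_or_uk = netflix[(is_movie & (netflix["Type"] == "Comedies"))
                         | (netflix["Country"] == "United Kingdom")]

# 8) Entries in which a given actor appears; na=False skips a missing cast.
with_actor = netflix[netflix["Cast"].str.contains("Actor P", na=False)]

# 9) Distinct ratings. 12) Sort by year.
ratings = netflix["Rating"].unique()
by_year = netflix.sort_values("Year")

# 13) (Movie and Dramas) or (TV Show and Kids' TV).
dramas_or_kids = netflix[(is_movie & (netflix["Type"] == "Dramas"))
                         | (is_tv & (netflix["Type"] == "Kids' TV"))]

print(busiest_year, top_directors, ratings, sep="\n\n")
```

The parentheses around each condition matter: `&` and `|` bind more tightly than `==` in Python, so leaving them out raises an error or silently changes the filter.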
1) What are all different subjects for which Udemy is offering courses ?
2) Which subject has the maximum number of courses ?
3) Show all the courses which are Free of Cost.
4) Show all the courses which are Paid.
5) Which are Top Selling Courses ?
6) Which are Least Selling Courses ?
7) Show all courses of Graphic Design where the price is below 100 ?
8) List out all the courses that are related to 'Python'.
9) What are courses that were published in the year 2015 ?
10) What is the Max. Number of Subscribers for Each Level of courses ?
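
The Udemy questions above could be answered along these lines. The course titles and numbers are toy data, the column names (`subject`, `is_paid`, `num_subscribers`, ...) are assumptions, and "selling" is read here as the number of subscribers, which is also an assumption.

```python
import pandas as pd

# Toy stand-in for the Udemy data; names and values are assumptions.
udemy = pd.DataFrame({
    "course_title": ["Python Basics", "Logo Design",
                     "Learn python fast", "Guitar 101"],
    "subject": ["Web Development", "Graphic Design",
                "Web Development", "Musical Instruments"],
    "is_paid": [True, True, False, True],
    "price": [50, 20, 0, 150],
    "num_subscribers": [1000, 300, 5000, 50],
    "level": ["Beginner Level", "All Levels", "Beginner Level", "All Levels"],
    "published": ["2015-03-01", "2016-06-01", "2015-09-15", "2017-01-10"],
})

# 1) Distinct subjects. 2) Subject with the most courses.
subjects = udemy["subject"].unique()
top_subject = udemy["subject"].value_counts().idxmax()

# 3) Free courses. 4) Paid courses.
free = udemy[~udemy["is_paid"]]
paid = udemy[udemy["is_paid"]]

# 5) Top-selling and 6) least-selling courses, ranked by subscribers.
top_selling = udemy.sort_values("num_subscribers", ascending=False).head(10)
least_selling = udemy.sort_values("num_subscribers").head(10)

# 7) Graphic Design courses priced below 100.
cheap_design = udemy[(udemy["subject"] == "Graphic Design") & (udemy["price"] < 100)]

# 8) Courses whose title mentions Python, ignoring case.
python_courses = udemy[udemy["course_title"].str.contains("python", case=False)]

# 9) Courses published in 2015.
udemy["published"] = pd.to_datetime(udemy["published"])
published_2015 = udemy[udemy["published"].dt.year == 2015]

# 10) Maximum number of subscribers per course level.
max_subs_by_level = udemy.groupby("level")["num_subscribers"].max()

print(top_subject, python_courses, max_subs_by_level, sep="\n\n")
```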
1) Find all the unique 'Wind Speed' values in the data.
2) Find the number of times when the 'Weather is exactly Clear'.
3) Find the number of times when the 'Wind Speed was exactly 4 km/h'.
4) Find out all the Null Values in the data.
5) Rename the column name 'Weather' of the dataframe to 'Weather Condition'.
6) What is the mean 'Visibility' ?
7) What is the Standard Deviation of 'Pressure' in this data?
8) What is the Variance of 'Relative Humidity' in this data ?
9) Find all instances when 'Snow' was recorded.
10) Find all instances when 'Wind Speed is above 24' and 'Visibility is 25'.
11) What is the Mean value of each column against each 'Weather Condition' ?
12) What is the Minimum & Maximum value of each column against each 'Weather Condition' ?
13) Show all the Records where Weather Condition is Fog.
14) Find all instances when 'Weather is Clear' or 'Visibility is above 40'.
15) Find all instances when :
A. 'Weather is Clear' and 'Relative Humidity is greater than 50'
or
B. 'Visibility is above 40'
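
A compact sketch of the weather tasks above. The readings are toy values and the column names (`Wind Speed_km/h`, `Visibility_km`, `Rel Hum_%`, `Press_kPa`) are assumptions. "Snow was recorded" is read as the weather text containing the word Snow, which is also an assumption.

```python
import pandas as pd

# Toy stand-in for the weather data; names and values are assumptions.
weather = pd.DataFrame({
    "Weather": ["Clear", "Fog", "Snow", "Clear", "Snow,Fog"],
    "Wind Speed_km/h": [4, 10, 30, 4, 26],
    "Visibility_km": [48.3, 1.2, 25.0, 25.0, 25.0],
    "Rel Hum_%": [45, 95, 80, 60, 90],
    "Press_kPa": [101.2, 100.8, 99.9, 101.5, 100.1],
})

# 1) Unique wind speeds. 2) and 3) Counts of exact matches.
unique_wind = weather["Wind Speed_km/h"].unique()
n_clear = (weather["Weather"] == "Clear").sum()
n_wind_4 = (weather["Wind Speed_km/h"] == 4).sum()

# 4) Nulls per column. 5) Rename the 'Weather' column.
null_counts = weather.isnull().sum()
weather = weather.rename(columns={"Weather": "Weather Condition"})

# 6) to 8) Mean, standard deviation and variance of single columns.
mean_visibility = weather["Visibility_km"].mean()
std_pressure = weather["Press_kPa"].std()
var_humidity = weather["Rel Hum_%"].var()

# 9) Rows whose condition mentions Snow.
snow = weather[weather["Weather Condition"].str.contains("Snow")]

# 10) Wind speed above 24 and visibility exactly 25.
windy_25 = weather[(weather["Wind Speed_km/h"] > 24) & (weather["Visibility_km"] == 25)]

# 11) and 12) Mean, minimum and maximum of each column per condition.
grouped = weather.groupby("Weather Condition")
mean_by_condition = grouped.mean(numeric_only=True)
min_max_by_condition = grouped.agg(["min", "max"])

# 13) to 15) Filters combining conditions with | (or) and & (and).
is_clear = weather["Weather Condition"] == "Clear"
fog = weather[weather["Weather Condition"] == "Fog"]
clear_or_visible = weather[is_clear | (weather["Visibility_km"] > 40)]
humid_clear_or_visible = weather[(is_clear & (weather["Rel Hum_%"] > 50))
                                 | (weather["Visibility_km"] > 40)]

print(mean_by_condition, min_max_by_condition, sep="\n\n")
```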
C is one of the most complicated but fascinating programming languages out there, as it approaches problems very differently from Python. It can be described as a "hardware language", because it is crucial to know how memory and the other components of a computer work; that knowledge gives you an idea of how to write code efficiently.
Below I list the two main problems I worked on in order to get used to the language and its features.
FYI: these problems were suggested to me by a friend who lives in Germany, so the comments are in German.